Formal synthesis of optimal information-gathering policies

Authors

  • Austin Jones
  • Mac Schwager
  • Calin Belta
Abstract

This paper considers the problem of informative path planning under temporal logic (TL) correctness constraints. For example, a robot deployed to a building after a natural disaster must explore the area and report possible locations of survivors with minimum uncertainty. The robot must satisfy the correctness requirement “Always avoid obstacles. Report data to rescuers at a data upload hub before exiting the scene. If a location with suspected fire is visited, investigate and report immediately.” In this work, we map the constrained informative path planning problem to a stochastic optimal control problem over a Markov decision process that integrates a robot’s motion under a given TL specification with its sensing model. We develop an optimal dynamic programming algorithm and a receding horizon approximation with significantly reduced computational cost. Both solutions are guaranteed to satisfy the given specification and are evaluated using simulations and experiments with ground robots. Our results show that the receding horizon solution approximates the optimal method closely, which indicates usefulness for persistent information gathering tasks.
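As a rough illustration of the two solution styles named in the abstract, the following is a minimal sketch of finite-horizon dynamic programming (backward induction) over a generic Markov decision process, plus a receding horizon variant that re-plans over a short lookahead. It is not the authors' algorithm: the toy transition table `TOY`, the reward values (standing in for information gain), and the function names are all hypothetical, and the MDP in the paper additionally encodes the TL specification and the sensing model.

```python
from typing import Dict, List, Tuple

# Hypothetical MDP encoding: P[state][action] is a list of
# (probability, next_state, reward) outcomes. Nothing here comes from the paper.
MDP = Dict[int, Dict[str, List[Tuple[float, int, float]]]]

TOY: MDP = {
    0: {"stay": [(1.0, 0, 0.0)],
        "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 1.0)],   # reward stands in for information gained
        "go": [(1.0, 2, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},  # absorbing state with no reward
}


def backward_induction(P: MDP, horizon: int):
    """Finite-horizon dynamic programming.

    Returns (values, policy): values[s] is the best expected total reward
    from s with `horizon` steps to go, and policy[t][s] is the best action
    in state s at time step t.
    """
    values = {s: 0.0 for s in P}  # terminal value is zero
    policy = []
    for _ in range(horizon):
        new_values, step_policy = {}, {}
        for s, actions in P.items():
            best_a, best_q = None, float("-inf")
            for a, outcomes in actions.items():
                # Expected immediate reward plus value of the successor state.
                q = sum(p * (r + values[s2]) for p, s2, r in outcomes)
                if q > best_q:
                    best_a, best_q = a, q
            new_values[s], step_policy[s] = best_q, best_a
        values = new_values
        policy.insert(0, step_policy)  # earlier time steps go to the front
    return values, policy


def receding_horizon_action(P: MDP, state: int, lookahead: int) -> str:
    """Plan over a short lookahead and return only the first action."""
    _, policy = backward_induction(P, lookahead)
    return policy[0][state]
```

In this sketch the receding horizon controller would be called again at every step from the newly observed state, trading optimality over the full horizon for a smaller planning problem each time.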

Similar resources

Forecasting and retrospection analysis of Tehran Municipality’s five years plan in human resource sector

Tehran Municipality has had two five-year plans so far, and the third is currently being formulated. However, an investigation of the human resources sector shows a gap between the present condition and the optimal situation in several domains. This article is an attempt to investigate the five-year plans in the domain of human resources via a practical model of forecast...

Control Theory and Economic Policy Optimization: The Origin, Achievements and the Fading Optimism from a Historical Standpoint

Economists were interested in economic stabilization policies as early as the 1930’s but the formal applications of stability theory from the classical control theory to economic analysis appeared in the early 1950’s when a number of control engineers actively collaborated with economists on economic stability and feedback mechanisms. The theory of optimal control resulting from the contributio...

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...

The Knowledge Gradient for Optimal Learning

Optimal learning addresses the problem of how to collect information so that it benefits future decisions. For off-line problems, we have to make a series of measurements or observations before choosing a final design or set of parameters; for online problems, we learn from rewards we are receiving, and we want to strike a balance between rewards earned now and better decisions in the future. T...

Learning Multi-agent Search Strategies

We identify a specialised class of reinforcement learning problem in which the agent(s) have the goal of gathering information (identifying the hidden state). The gathered information can affect rewards but not optimal behaviour. Exploiting this characteristic, an algorithm is developed for evaluating an agent’s policy against all possible hidden state histories at the same time. Experimental r...

Publication date: 2014